In this research work, we demonstrate the application of Mask-RCNN (Region-based Convolutional Neural Network), a deep-learning algorithm for computer vision and specifically object detection, to the semiconductor defect inspection domain. Stochastic defect detection and classification during semiconductor manufacturing has become a challenging task as circuit pattern dimensions continue to shrink (e.g., for pitches less than 32 nm). Defect inspection and analysis by state-of-the-art optical and e-beam inspection tools is generally driven by rule-based techniques, which often leads to misclassification and thereby necessitates human expert intervention. In this work, we revisit and extend our previous deep learning-based defect classification and detection method towards improved defect instance segmentation in SEM images, capturing the precise extent of each defect and generating a mask for each defect category/instance. This also enables us to extract and calibrate each segmented mask and to quantify the pixels that make up each mask, which in turn lets us count the defect instances in each category and calculate their surface area in terms of pixels. We aim to detect and segment different types of inter-class stochastic defect patterns such as bridge, break, and line collapse, and to differentiate accurately between intra-class multi-categorical bridge defect scenarios (thin/single/multi-line/horizontal/non-horizontal) for aggressive pitches as well as thin resists (High NA applications). Our proposed approach demonstrates its effectiveness both quantitatively and qualitatively.
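The mask quantification step described above (counting instances per category and measuring area in pixels) can be sketched roughly as follows. This is only an illustrative sketch: the function name `summarize_masks`, the array layout `(N, H, W)`, and the toy data are assumptions, not the authors' implementation, and it presumes the segmentation model has already produced one boolean mask and one category label per instance.

```python
import numpy as np
from collections import Counter

def summarize_masks(masks, labels):
    """Summarize instance masks (hypothetical helper, not from the paper).

    masks:  boolean array of shape (N, H, W), one mask per detected instance
    labels: list of N category names, e.g. "bridge", "break"
    """
    masks = np.asarray(masks, dtype=bool)
    # Area of each instance = number of True pixels in its mask.
    areas = masks.reshape(len(masks), -1).sum(axis=1).tolist()
    # Number of instances per defect category.
    counts = dict(Counter(labels))
    # Total pixel area per defect category.
    area_by_class = {}
    for label, area in zip(labels, areas):
        area_by_class[label] = area_by_class.get(label, 0) + area
    return areas, counts, area_by_class

# Toy example: three instances on an 8x8 image (made-up data).
masks = np.zeros((3, 8, 8), dtype=bool)
masks[0, 1:3, 1:5] = True   # 2 x 4 block
masks[1, 4:5, 0:6] = True   # 1 x 6 block
masks[2, 6:8, 6:8] = True   # 2 x 2 block
labels = ["bridge", "break", "bridge"]

areas, counts, area_by_class = summarize_masks(masks, labels)
print(areas, counts, area_by_class)
```

Converting pixel counts to physical area would additionally require the pixel size of the SEM image, which is not given here.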
translated by 谷歌翻译
This paper describes the 5th edition of the Predicting Video Memorability Task as part of MediaEval2022. This year we have reorganised and simplified the task in order to facilitate a greater depth of inquiry. Similar to last year, two datasets are provided in order to facilitate generalisation; however, this year we have replaced the TRECVid2019 Video-to-Text dataset with the VideoMem dataset in order to remedy underlying data quality issues, and to prioritise short-term memorability prediction by elevating the Memento10k dataset as the primary dataset. Additionally, a fully fledged electroencephalography (EEG)-based prediction sub-task is introduced. In this paper, we outline the core facets of the task and its constituent sub-tasks, describing the datasets, evaluation metrics, and requirements for participant submissions.
In a rapidly developing country like Bangladesh, accidents at unmanned level crossings are increasing daily. This study presents a deep learning-based approach for automating level crossing junctions to improve safety. Here, we develop a fully automated technique using computer vision on a microcontroller that aims to reduce, and ideally eliminate, level-crossing deaths and accidents. A Raspberry Pi microcontroller detects impending trains using computer vision on live video, and the intersection is closed until the incoming train has passed unimpeded. Live video activity recognition and object detection algorithms scan the junction 24/7. Self-regulating microcontrollers control the entire process. When persistent unauthorized activity is identified, authorities, such as police and fire brigade, are notified via automated messages and notifications. The microcontroller evaluates live rail-track data and arrival and departure times to anticipate ETAs, train position, velocity, and track problems in order to avoid head-on collisions. The proposed scheme reduces level crossing accidents and fatalities at a lower cost than current market solutions. Index Terms: Deep Learning, Microcontroller, Object Detection, Railway Crossing, Raspberry Pi
The Predicting Media Memorability task in the MediaEval evaluation campaign has been running annually since 2018 and several different tasks and data sets have been used in this time. This has allowed us to compare the performance of many memorability prediction techniques on the same data and in a reproducible way and to refine and improve on those techniques. The resources created to compute media memorability are now being used by researchers well beyond the actual evaluation campaign. In this paper we present a summary of the task, including the collective lessons we have learned for the research community.
Current state-of-the-art approaches to text classification typically leverage BERT-style Transformer models with a softmax classifier, jointly fine-tuned to predict class labels of a target task. In this paper, we instead propose an alternative training objective in which we learn task-specific embeddings of text: our proposed objective learns embeddings such that all texts that share the same target class label should be close together in the embedding space, while all others should be far apart. This allows us to replace the softmax classifier with a more interpretable k-nearest-neighbor classification approach. In a series of experiments, we show that this yields a number of interesting benefits: (1) The resulting order induced by distances in the embedding space can be used to directly explain classification decisions. (2) This facilitates qualitative inspection of the training data, helping us to better understand the problem space and identify labelling quality issues. (3) The learned distances to some degree generalize to unseen classes, allowing us to incrementally add new classes without retraining the model. We present extensive experiments which show that the benefits of ante-hoc explainability and incremental learning come at no cost in overall classification accuracy, thus pointing to practical applicability of our proposed approach.
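To illustrate how a k-nearest-neighbor classifier over learned embeddings can both predict a label and expose the evidence for it, here is a minimal sketch. It assumes the embeddings have already been produced by some fine-tuned encoder; the Euclidean distance, the function name `knn_predict`, and the toy 2-D vectors are assumptions for illustration and are not taken from the paper.

```python
import numpy as np
from collections import Counter

def knn_predict(train_emb, train_labels, query_emb, k=3):
    """Majority vote over the k nearest training embeddings (hypothetical helper).

    Returns the predicted label and the indices of the neighbours used,
    which can be shown to a user as the explanation for the decision.
    """
    train_emb = np.asarray(train_emb, dtype=float)
    query_emb = np.asarray(query_emb, dtype=float)
    # Euclidean distance from the query to every training embedding.
    dists = np.linalg.norm(train_emb - query_emb, axis=1)
    # Indices of the k closest training examples, nearest first.
    nearest = np.argsort(dists, kind="stable")[:k]
    votes = Counter(train_labels[i] for i in nearest)
    label = votes.most_common(1)[0][0]
    return label, nearest.tolist()

# Toy 2-D "embeddings" for two classes (made-up data).
train_emb = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
             [1.0, 1.0], [1.1, 1.0], [1.0, 1.1]]
train_labels = ["neg", "neg", "neg", "pos", "pos", "pos"]

label, neighbours = knn_predict(train_emb, train_labels, [0.9, 1.0], k=3)
print(label, neighbours)
```

Under this scheme, adding a new class only requires appending its embedded examples and labels to the training set, with no change to the encoder.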
Recent mean field interpretations of learning dynamics in over-parameterized neural networks offer theoretical insights on the empirical success of first order optimization algorithms in finding global minima of the nonconvex risk landscape. In this paper, we explore applying mean field learning dynamics as a computational algorithm, rather than as an analytical tool. Specifically, we design a Sinkhorn regularized proximal algorithm to approximate the distributional flow from the learning dynamics in the mean field regime over weighted point clouds. In this setting, a contractive fixed point recursion computes the time-varying weights, numerically realizing the interacting Wasserstein gradient flow of the parameter distribution supported over the neuronal ensemble. An appealing aspect of the proposed algorithm is that the measure-valued recursions allow meshless computation. We demonstrate the proposed computational framework of interacting weighted particle evolution on binary and multi-class classification. Our algorithm performs gradient descent of the free energy associated with the risk functional.
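For readers unfamiliar with Sinkhorn-type fixed point recursions over weighted point clouds, the following sketch shows the standard Sinkhorn iteration for entropy-regularized optimal transport between two small point clouds. It is not the proximal algorithm of the paper; it only illustrates the kind of meshless scaling recursion that such methods build on. The function name `sinkhorn`, the regularization value `eps`, the iteration count, and the random toy data are all assumptions.

```python
import numpy as np

def sinkhorn(a, b, cost, eps=0.5, n_iters=500):
    """Standard Sinkhorn iteration for entropic optimal transport.

    a, b: weights of the two point clouds (each sums to 1)
    cost: pairwise cost matrix of shape (len(a), len(b))
    Returns the approximate transport plan with marginals a and b.
    """
    K = np.exp(-cost / eps)          # Gibbs kernel
    u = np.ones_like(a)
    v = np.ones_like(b)
    for _ in range(n_iters):
        v = b / (K.T @ u)            # rescale to match column marginal b
        u = a / (K @ v)              # rescale to match row marginal a
    return u[:, None] * K * v[None, :]

# Two small weighted point clouds in the unit square (made-up data).
rng = np.random.default_rng(0)
x = rng.random((5, 2))
y = rng.random((4, 2))
cost = ((x[:, None, :] - y[None, :, :]) ** 2).sum(axis=2)  # squared distances
a = np.full(5, 1 / 5)
b = np.array([0.1, 0.2, 0.3, 0.4])

plan = sinkhorn(a, b, cost)
print(plan.sum(axis=1), plan.sum(axis=0))
```

No grid over the underlying space is needed: the recursion only touches the weights attached to the sample points, which is the sense in which such computations are meshless.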
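Dynamic magnetic resonance imaging (MRI) is a popular medical imaging technique that generates image sequences of the flow of a contrast material inside tissues and organs. However, its application to imaging bolus movement through the esophagus has been demonstrated in only a few feasibility studies and remains relatively unexplored. In this work, we present a computational framework called mechanics-informed MRI (MRI-MECH) that enhances this capability, thereby increasing the applicability of dynamic MRI to diagnosing esophageal disorders. Pineapple juice was used as the swallowed contrast material for the dynamic MRI, and the MRI image sequence was used as input to MRI-MECH. MRI-MECH models the esophagus as a flexible one-dimensional tube whose elastic walls follow a linear tube law. Flow through the esophagus is then governed by one-dimensional mass and momentum conservation equations. These equations are solved using a physics-informed neural network (PINN). The PINN minimizes the difference between the MRI measurements and the model predictions while ensuring that the physics of the fluid flow problem is always obeyed. MRI-MECH calculates the fluid velocity and pressure during esophageal transit and estimates the mechanical health of the esophagus by calculating wall stiffness and active relaxation. In addition, MRI-MECH predicts missing information about the lower esophageal sphincter during the emptying process, demonstrating its applicability to scenarios with missing data or poor image resolution. Besides improving clinical decisions through quantitative estimates of the mechanical health of the esophagus, MRI-MECH can also be extended to other medical imaging modalities to enhance their functionality.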
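The flexible octopus arm has an exceptional ability to coordinate a large number of degrees of freedom and perform complex manipulation tasks. As a result, these systems continue to attract the attention of biologists and roboticists alike. In this paper, we develop a three-dimensional model of a soft octopus arm equipped with biomechanically realistic muscle actuation. The internal forces and couples exerted by all major muscle groups are considered. An energy shaping control method is described that coordinates muscle activity in order to grasp and reach in 3D space. The main contributions of this paper are: (i) modeling of the major muscle groups to elicit three-dimensional movements; (ii) a mathematical formulation of muscle activations based on a stored energy function; and (iii) a computationally efficient procedure for designing task-specific equilibrium configurations, obtained by solving an optimization problem in the special Euclidean group SE(3). Muscle controls are then computed iteratively from the co-state variable arising from the solution of the optimization problem. The approach is demonstrated numerically in the physically accurate software environment Elastica. Results of numerical experiments mimicking observed octopus behaviors are reported.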
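We propose formulating the finite-horizon stochastic optimal control problem for colloidal self-assembly in the space of probability density functions (PDFs) of the underlying state variables (namely, the order parameters). The control objective is formulated as steering the state PDF from a prescribed initial probability measure to a prescribed terminal probability measure with minimum control effort. For specificity, we use a univariate stochastic state model from the literature. Both the analysis and the computational steps for control synthesis developed in this paper generalize to multivariate stochastic state dynamics given by generic models that are nonlinear in the state and non-affine in the control. We derive the conditions of optimality for the associated optimal control problem. This derivation yields a system of three coupled partial differential equations together with boundary conditions at the initial and terminal times. The resulting system is a generalized instance of the so-called Schrödinger bridge problem. We then determine the optimal control policy by training a physics-informed deep neural network, where the "physics" are the derived conditions of optimality. The performance of the proposed solution is demonstrated via numerical simulations on a benchmark colloidal self-assembly problem.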
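Observing that for certain NLP tasks, such as semantic role prediction or thematic fit estimation, random embeddings perform as well as pretrained embeddings, we explore which settings allow for this and examine where most of the learning is encoded: the word embeddings, the semantic role embeddings, or "the network". We find nuanced answers, depending on the task and its relation to the training objective. We examine these representation learning aspects in multi-task learning, where role prediction and role filling are supervised tasks, while several thematic fit tasks lie outside the models' direct supervision. We observe a non-monotonic relation between the quality scores of some tasks and the size of the training data. To better understand this observation, we analyze these results using per-verb versions of these tasks.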